On the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization

Authors

Yuki Mitsufuji

Axel Röbel

Abstract

This paper proposes a new method to enhance the performance of non-negative tensor factorization (NTF), one of the most prevalent source separation techniques nowadays. The enhancement is mainly achieved by introducing weights on bin-wise NTF cost functions, which differentiate NTF target components from other components so that the target is approximated more precisely than the others. Assuming sources are distributed sparsely in a 2-D sound field, the target components approximating a target source are exclusively selected by a user, or from accompanying images, by means of providing a spatial cue to an NTF framework. The spatial cue is given in a similar format to the well-known binaural feature, inter-channel level difference (IID). This helps incorporate the spatial cue into the system, since similar features in this format can be easily calculated from every spectrogram bin. The weighting functions are designed taking into account the distance between the spatial cue and the calculated features. Namely, the largest values are assigned to the spectrogram bins where the features present the highest similarity to the spatial cue, and the value decreases in proportion to the distance between them. The method is evaluated in terms of separation quality, comparing the proposed algorithm to the conventional NTF technique, PARAFAC-NTF, as well as other source separation techniques. The evaluation results, measured by the metrics signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio (SAR), demonstrate the effectiveness of the new method, improved primarily by the weighting function and the initialization based on IID, while also demonstrating a decrease in computational costs, a significant problem with NTF.
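
The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the level feature `|R| / (|L| + |R|)`, the linear decay, the `floor` parameter, and the squared-error cost are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

def level_feature(mag_left, mag_right, eps=1e-12):
    """Per-bin level feature in [0, 1]: 0 = fully left, 1 = fully right.
    Assumed form; the paper defines its own IID-like feature."""
    return mag_right / (mag_left + mag_right + eps)

def spatial_weights(mag_left, mag_right, cue, floor=0.1):
    """Weights that peak where the feature equals the cue and
    decrease linearly with the distance from it (assumed decay)."""
    distance = np.abs(level_feature(mag_left, mag_right) - cue)
    return np.maximum(1.0 - distance, floor)

def weighted_cost(observed, model, weights):
    """Bin-wise weighted squared error, used here only as an example cost."""
    return float(np.sum(weights * (observed - model) ** 2))

# Toy 2x2 magnitude spectrograms (frequency x time) for each channel.
mag_left = np.array([[1.0, 0.0], [0.5, 0.2]])
mag_right = np.array([[0.0, 1.0], [0.5, 0.8]])

# A cue of 1.0 targets a source panned fully to the right.
weights = spatial_weights(mag_left, mag_right, cue=1.0)
print(weights)
```

Bins dominated by the right channel receive weights close to 1, so errors there contribute most to the cost, while bins far from the cue are held at the floor value rather than ignored entirely.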

Similar resources

Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition

Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that an entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...

On-Line NMF-Based Stereo Up-Mixing of Speech Improves Perceived Reduction of Non-Stationary Noise

Speech de-noising algorithms often suffer from introduction of artifacts, either by removal of parts of the speech signal, or imperfect noise reduction causing the remaining noise to sound unnatural and disturbing. This contribution proposes to spatially distribute monaural noisy speech signals based on single-channel source separation, in order to improve the perceived speech quality. Stereo u...

A social recommender system based on matrix factorization considering dynamics of user preferences

With the expansion of social networks, the use of recommender systems in these networks has attracted considerable attention. Recommender systems have become an important tool for alleviating users' information overload problem by providing personalized recommendations of items a user might like, based on past preferences or observed behavior regarding one or more items. In these systems...

Spatio-temporal analysis of the covid-19 impacts on the using Chicago urban shared bicycles by tensor-based approach

Cycling is a phenomenon in urban transportation that has the ability to allocate a specific location at any moment in time. Accordingly, spatial analysis of bicycle trips can be accompanied by temporal analysis. The use of a GIS environment is commonly recommended to display the extent of the phenomenon's spatial changes. However, in order to apply and display changes over time, it will requir...

Unsupervised Learning Methods for Source Separation in Monaural Music Signals

Computational analysis of polyphonic musical audio is a challenging problem. When several instruments are played simultaneously, their acoustic signals mix, and estimation of an individual instrument is disturbed by the other co-occurring sounds. The analysis task would become much easier if there was a way to separate the signals of different instruments from each other. Techniques that implem...

Journal title:

EURASIP J. Adv. Sig. Proc.

Volume 2014

Publication date: 2014
